Reliability and Validity in Nursing Research

 

P. Tamilselvi1, G. Ramamurthy2

1Reader, Adhi Parasakthi College of Nursing, Melmaruvathur

2Staff Nurse, JIPMER, Puducherry

*Corresponding Author Email: selvitamil79@gmail.com

 

 


INTRODUCTION:

Researchers face numerous challenges in conducting research. They want their findings to reflect a truth that is relevant, accurate and sensitive. Advances in health sciences research depend on reliability and validity, which lie at the heart of competent and effective study. Even competent researchers often not only fail to report the reliability of their measures (Henson, 2001; Thompson, 1999), but also fall short of grasping the inextricable link between scale validity and effective research.

 

Reliability [1]

Reliability is the consistency with which an instrument measures the attribute. It also concerns a measure’s accuracy. An instrument is reliable to the extent that its measures reflect true scores, that is, to the extent that measurement errors are absent from obtained scores. A reliable instrument maximizes the true score component and minimizes the error component of an obtained score.

 

Definition

The reliability of an instrument is the degree of consistency with which the instrument measures the target attribute (Polit and Hungler [1]).

 

Concept of Reliability [2]

The concept of reliability in relation to a research instrument has a meaning similar to its everyday sense. If a research tool is consistent and stable, and hence predictable and accurate, it is said to be reliable. The greater the degree of consistency and stability in an instrument, the greater its reliability.

The concept of reliability can be looked at from two sides:

1.     How reliable is an instrument?

2.     How unreliable is it?

 

The first question focuses on the ability of an instrument to produce consistent measurements. When you collect the same set of information more than once, using the same instrument, and get the same or similar results under the same or similar conditions, the instrument is considered to be reliable [2].

 

The second question focuses on the degree of inconsistency in the measurements made by an instrument, that is, the extent of difference in the measurements when you collect the same set of information more than once, using the same instrument under the same or similar conditions [2].

 

Methods of determining the reliability of an instrument:

Stability:

The stability of a measure is the extent to which the same scores are obtained when the instrument is used with the same people on separate occasions.

Assessments of stability are derived through test-retest reliability procedures. The researcher administers the same measure to a sample of people on two occasions and then compares the scores.

Stability is thus the extent to which similar results are obtained on two separate administrations. The reliability estimate focuses on the instrument’s susceptibility to extraneous factors over time and is expressed as a correlation coefficient that can range from -1 to 1.
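As a minimal sketch of the comparison described above, the scores from the two administrations can be correlated with the Pearson product-moment coefficient. The function name pearson_r and the scores are hypothetical and chosen only for illustration; the text does not prescribe a particular coefficient, so the choice of Pearson's r is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # Note: undefined (division by zero) if either list has no variation.
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five respondents on two occasions (illustrative only).
time_1 = [10, 12, 15, 18, 20]
time_2 = [11, 12, 14, 19, 20]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))
```

A coefficient close to 1 would suggest that respondents kept roughly the same relative standing across the two occasions; how high is "high enough" depends on the instrument and is not settled by this sketch.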

 

Equivalence:

The equivalence approach to estimating reliability, used primarily with structured observational instruments, determines the consistency or equivalence of the instrument across different observers or raters.

The degree of error can be assessed through interrater reliability, which is estimated by having two or more trained observers make simultaneous, independent observations. The resulting data can then be used to calculate an index of equivalence or agreement. That is, a reliability coefficient can be computed to demonstrate the strength of the relationship between the observers’ ratings. When two independent observers score some phenomenon congruently, the scores are likely to be accurate and reliable.
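One possible index of agreement for two observers who assign categories is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The text does not name a specific index, so the use of kappa here is an assumption, and the function name cohen_kappa and the ratings are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of observations on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


# Hypothetical simultaneous observations of ten events by two trained observers.
observer_1 = ["yes", "yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes"]
observer_2 = ["yes", "yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes"]

kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 2))
```

In this illustrative data the observers agree on 8 of 10 observations, but kappa is lower than 0.8 because some of that agreement would be expected by chance alone.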

 

Internal consistency:

Internal consistency is the extent to which tests or procedures assess the same characteristic, skill or quality. It is a measure of the precision between the observers or the measuring instruments used in a study. The split-half technique is used for measuring this aspect of an instrument’s reliability.

 

The split-half technique is designed to correlate half of the items with the other half and is appropriate for instruments designed to measure attitudes towards an issue or phenomenon. The questions or statements are divided so that items intended to measure the same aspect fall into the two halves, and the scores on the two halves are correlated. Because the product-moment correlation is calculated on the basis of only half the instrument, it needs to be corrected to assess the reliability of the whole. This is known as stepped-up reliability.

 

The correction is made with the Spearman-Brown formula:

Reliability of the whole test = 2 × (reliability of half test) / (1 + reliability of half test)
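The arithmetic of the Spearman-Brown correction can be sketched in a few lines. The function name spearman_brown and the half-test value of 0.6 are hypothetical and serve only as a worked example.

```python
def spearman_brown(half_r):
    """Step up a half-test correlation to an estimate for the whole test."""
    if not -1 < half_r <= 1:
        raise ValueError("half-test correlation must be greater than -1 and at most 1")
    return 2 * half_r / (1 + half_r)


# Hypothetical correlation between the two halves of an instrument.
half_test_r = 0.6
whole_test_r = spearman_brown(half_test_r)
print(round(whole_test_r, 2))  # prints 0.75
```

For a positive half-test correlation below 1, the stepped-up estimate is always larger than the half-test value, which reflects that the full instrument contains twice as many items as either half.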

 

Factors Affecting Reliability of a Research Instrument [2]

1.     The wording of questions: a slight ambiguity in the wording of questions or statements can affect the reliability of the measuring instrument, as respondents may interpret the questions differently at different times, resulting in different responses.

2.     The physical setting: any change in the physical setting at the time of a repeated interview may affect the responses given by a respondent, which may affect reliability.

3.     The respondent’s mood: any change in a respondent’s mood when responding to questions or writing answers in a questionnaire can affect the reliability of an instrument.

 

Validity in Research

Validity is the appropriateness, meaningfulness and usefulness of the inferences made from the scoring of the instrument.

 

Types of Validity [3]

Face Validity involves an overall look at an instrument regarding its appropriateness to measure a particular attribute or phenomenon. Face validity is not considered a very important or essential type of validity. A researcher may judge the face value of an instrument by its appearance, that is, whether it looks good or not, but this provides no guarantee about the appropriateness and completeness of the instrument with regard to its content, construct and measurement score.

 

Content Validity is concerned with the scope of coverage of the content area to be measured. It is applied in tests that measure knowledge and is mostly used with complex psychological tests of a person. Judgments of content validity may be subjective and are based on the opinions of previous researchers and experts about the adequacy, appropriateness and completeness of the content of the instrument.

 

Criterion Validity is the relationship between the measurements of the instrument and some other external criterion. Criterion-related validity may be differentiated into predictive and concurrent validity.

 

(i) Predictive validity is the ability of an assessment measure to predict someone’s future behavior in a related but different situation. An assessment measure with high predictive validity is capable of making accurate predictions of future behavior. Low predictive validity means that a measure is of little use in predicting a particular behavior.

 

(ii) Concurrent validity reflects how well different measures of the same trait agree with one another. If a test possesses a high degree of concurrent validity, it can be expected to give results very similar to those of other measures of the same characteristic.

 

Formative Validity, when applied to outcomes assessment, is used to assess how well a measure is able to provide information to help improve the program under study.

 

Sampling Validity ensures that the measure covers the broad range of areas within the concept under study. Not everything can be covered, so items need to be sampled from all of the domains. This may need to be completed using a panel of “experts” to ensure that the content area is adequately sampled. Additionally, a panel can help limit “expert” bias.

 

Construct validity occurs when the theoretical constructs of cause and effect accurately represent the real-world situations they are intended to model. This is related to how well the experiment is operationalized. A good experiment turns the theory (constructs) into actual things you can measure. Sometimes just finding out more about the construct (which itself must be valid) can be helpful.

 

Construct validity is thus an assessment of the quality of an instrument or experimental design. It asks, 'Does it measure the construct it is supposed to measure?' If you do not have construct validity, you will likely draw incorrect conclusions from the experiment (garbage in, garbage out).

 

(i) Convergent validity [4]

Convergent validity occurs where measures of constructs that are expected to correlate do so. This is similar to concurrent validity (which looks for correlation with other tests).

 

(ii) Discriminant validity

Discriminant validity occurs where constructs that are expected not to relate do not, such that it is possible to discriminate between these constructs.

Convergence and discrimination are often demonstrated by correlation of the measures used within constructs.

Convergent validity and discriminant validity together demonstrate construct validity.
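The correlational reasoning behind convergence and discrimination can be illustrated with a simple check on a set of already-computed correlations. Everything in this sketch is an assumption made for illustration: the measure names, the correlation values, the function name supports_construct_validity, and the decision rule itself, which is a simplified heuristic rather than an established test.

```python
# Hypothetical correlations between measures (illustrative values, not real data).
correlations = {
    ("anxiety_scale_a", "anxiety_scale_b"): 0.78,  # same construct: expected to correlate
    ("anxiety_scale_a", "numeracy_test"): 0.12,    # different constructs: expected not to relate
    ("anxiety_scale_b", "numeracy_test"): 0.09,
}
convergent_pairs = [("anxiety_scale_a", "anxiety_scale_b")]
discriminant_pairs = [
    ("anxiety_scale_a", "numeracy_test"),
    ("anxiety_scale_b", "numeracy_test"),
]


def supports_construct_validity(corr, convergent, discriminant):
    """True when every convergent correlation exceeds every discriminant one in absolute size."""
    weakest_convergent = min(abs(corr[pair]) for pair in convergent)
    strongest_discriminant = max(abs(corr[pair]) for pair in discriminant)
    return weakest_convergent > strongest_discriminant


print(supports_construct_validity(correlations, convergent_pairs, discriminant_pairs))
```

The rule only compares the pattern of correlations; it says nothing about whether the individual correlations are large enough, or estimated from a large enough sample, to be trusted.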

 

(iii) Nomological network [3]

Defined by Cronbach and Meehl, this is the set of relationships between constructs and between consequent measures. The relationships between constructs should be reflected in the relationships between measures or observations.

 

(iv) Multitrait-Multimethod Matrix (MTMM) [4]

Defined by Campbell and Fiske, this demonstrates construct validity by using multiple methods (e.g. survey, observation, test) to measure the same set of 'traits' and showing the correlations in a matrix, where blocks and diagonals have special meaning.

 

Internal validity

Internal validity occurs when it can be concluded that there is a causal relationship between the variables being studied. It is related to the design of the experiment, such as in the use of random assignment of treatments.

 

Conclusion validity

Conclusion validity occurs when you can conclude that there is a relationship of some kind between the two variables being examined.

This may be a positive or a negative correlation.

 

External validity

External validity occurs when the causal relationship discovered can be generalized to other people, times and contexts.

Correct sampling will allow generalization and hence give external validity.

 

Factors affecting internal validity [7]

     Subject variability

     Size of subject population

     Time given for the data collection or experimental treatment

     History

     Attrition

     Maturation

     Instrument/task sensitivity                                                                                        

 

Seven important factors affecting external validity [5, 6, 7]

     Population characteristics (subjects)

     Interaction of subject selection and research

     Descriptive explicitness of the independent variable

     The effect of the research environment

     Researcher or experimenter effects

     Data collection methodology

     The effect of time

 

Ways to improve validity and reliability [2, 3]

1.     Make sure your goals and objectives are clearly defined and operationalized.  Expectations of students should be written down.

2.     Match your assessment measure to your goals and objectives. Additionally, have the test reviewed by faculty at other schools to obtain feedback from an outside party who is less invested in the instrument.

3.     Get students involved; have the students look over the assessment for troublesome wording, or other difficulties.

4.     If possible, compare your measure with other measures, or data that may be available.

 

REFERENCES:

1.        Denise F. Polit and Cheryl Tatano Beck. “Essentials of Nursing Research”. 7th edition. Lippincott Williams & Wilkins. 2009. Pages 373-376.

2.        Carmen G. Loiselle and Joanne Profetto-McGrath. “Canadian Essentials of Nursing Research”. 2004. Pages 307-311.

3.        Ranjith Kumar. “Research Methodology: A Step-by-Step Guide for Beginners”. Sage Publications. New Delhi. 1999. Pages 136-143.

4.        Suresh Sharma. “Nursing Research and Statistics”. Elsevier. 2011. Pages 216-218.

5.        linguistics.byu.edu/faculty/research methodology

6.        econ.upm.edu.my/research

7.        Changing minds.org/explanations/research design.

 

 

 

 

Received on 16.06.2013          Modified on 28.09.2013

Accepted on 01.10.2013          © A&V Publication all right reserved

Asian J. Nur. Edu. and Research 3(4): Oct.- Dec., 2013; Page 270-272